NSF PAR Search | NSF Public Access Repository

Covariance loss, Szemeredi regularity, and differential privacy

https://doi.org/10.1090/proc/17126

Boedihardjo, March; Strohmer, Thomas; Vershynin, Roman (February 2025, Proceedings of the American Mathematical Society)

We show how randomized rounding based on Grothendieck’s identity can be used to prove a nearly tight bound on the covariance loss, that is, the amount of covariance that is lost by taking conditional expectation. This result yields a new type of weak Szemeredi regularity lemma for positive semidefinite matrices and kernels. Moreover, it can be used to construct differentially private synthetic data.
Free, publicly-accessible full text available February 1, 2026
Differentially private low-dimensional synthetic data from high-dimensional datasets

https://doi.org/10.1093/imaiai/iaae034

He, Yiyun; Strohmer, Thomas; Vershynin, Roman; Zhu, Yizhe (January 2025, Information and Inference: A Journal of the IMA)

Differentially private synthetic data provide a powerful mechanism to enable data analysis while protecting sensitive information about individuals. However, when the data lie in a high-dimensional space, the accuracy of the synthetic data suffers from the curse of dimensionality. In this paper, we propose a differentially private algorithm to generate low-dimensional synthetic data efficiently from a high-dimensional dataset with a utility guarantee with respect to the Wasserstein distance. A key step of our algorithm is a private principal component analysis (PCA) procedure with a near-optimal accuracy bound that circumvents the curse of dimensionality. Unlike the standard perturbation analysis, our analysis of private PCA works without assuming the spectral gap for the covariance matrix.
Full Text Available
Monotone Operator Theory-Inspired Message Passing for Learning Long-Range Interaction on Graphs

Baker, Justin M; Wang, Qingsong; Berzins, Martin; Strohmer, Thomas; Wang, Bao (July 2024, Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR)

Full Text Available
Private measures, random walks, and synthetic data

Boedihardjo, March; Strohmer, Thomas; Vershynin, Roman (April 2024, Probability Theory and Related Fields)

Full Text Available
Private measures, random walks, and synthetic data

https://doi.org/10.1007/s00440-024-01279-z

Boedihardjo, March; Strohmer, Thomas; Vershynin, Roman (April 2024, Probability Theory and Related Fields)

Differential privacy is a mathematical concept that provides an information-theoretic security guarantee. While differential privacy has emerged as a de facto standard for guaranteeing privacy in data sharing, the known mechanisms to achieve it come with some serious limitations. Utility guarantees are usually provided only for a fixed, a priori specified set of queries. Moreover, there are no utility guarantees for more complex, but very common, machine learning tasks such as clustering or classification. In this paper we overcome some of these limitations. Working with metric privacy, a powerful generalization of differential privacy, we develop a polynomial-time algorithm that creates a private measure from a data set. This private measure allows us to efficiently construct private synthetic data that are accurate for a wide range of statistical analysis tools. Moreover, we prove an asymptotically sharp min-max result for private measures and synthetic data in general compact metric spaces, for any fixed privacy budget $\varepsilon$ bounded away from zero. A key ingredient in our construction is a new superregular random walk, whose joint distribution of steps is as regular as that of independent random variables, yet which deviates from the origin logarithmically slowly.
Monotone Operator Theory-Inspired Message Passing for Learning Long-Range Interaction on Graphs

Baker, Justin M; Wang, Qingsong; Berzins, Martin; Strohmer, Thomas; Wang, Bao (May 2024, PMLR)

Full Text Available
Monotone Operator Theory-Inspired Message Passing for Learning Long-Range Interaction on Graphs

Baker, Justin; Wang, Qingsong; Berzins, Martin; Strohmer, Thomas; Wang, Bao (April 2024, International Conference on Artificial Intelligence and Statistics)

Full Text Available
Monotone Operator Theory-Inspired Message Passing for Learning Long-Range Interaction on Graphs

Baker, Justin; Wang, Qingsong; Berzins, Martin; Strohmer, Thomas; Wang, Bao (May 2024, International Conference on Artificial Intelligence and Statistics (AISTATS))

Full Text Available
Covariance's Loss is Privacy's Gain: Computationally Efficient, Private and Accurate Synthetic Data

Boedihardjo, March; Strohmer, Thomas; Vershynin, Roman (February 2024, Foundations of Computational Mathematics)

Full Text Available
Fair Data Representation for Machine Learning at the Pareto Frontier

Xu, Shizhou; Strohmer, Thomas (November 2023, Journal of Machine Learning Research)

Full Text Available
